Biomedical ontology improves biomedical literature clustering performance: a comparison study

Authors

  • Illhoi Yoo
  • Xiaohua Hu
  • Il-Yeol Song
Abstract

Document clustering has been used to improve document retrieval and text mining. In this paper, we investigate whether a biomedical ontology improves biomedical literature clustering performance in terms of effectiveness and scalability. For this investigation, we perform a comprehensive comparison study of various document clustering approaches, including hierarchical clustering methods, Bisecting K-means, K-means and Suffix Tree Clustering (STC). According to our experimental results, a biomedical ontology significantly enhances clustering quality on biomedical documents. In addition, our results show that decent document clustering approaches, such as Bisecting K-means, K-means and STC, gain some benefit from the ontology, while the hierarchical algorithms, which show the poorest clustering quality, do not reap the benefit of the biomedical ontology.

Similar resources

A Coherent Biomedical Literature Clustering and Summarization Approach Through Ontology-Enriched Graphical Representations

In this paper, we introduce a coherent biomedical literature clustering and summarization approach that employs a graphical representation method for text using a biomedical ontology. The key to the approach is to construct document cluster models as semantic chunks capturing the core semantic relationships in the ontology-enriched scale-free graphical representation of documents. These documen...

Onto-clust - A methodology for combining clustering analysis and ontological methods for identifying groups of comorbidities for developmental disorders

Children with developmental disorders usually exhibit multiple developmental problems (comorbidities). Hence, such diagnosis needs to revolve around developmental disorder groups. Our objective is to systematically identify developmental disorder groups and represent them in an ontology. We developed a methodology that combines two methods: (1) a literature-based ontology that we created, which repr...

Biomedical Ontologies and Text Mining for Biomedicine and Healthcare: A Survey

In this survey paper, we discuss biomedical ontologies and major text mining techniques applied to biomedicine and healthcare. Biomedical ontologies such as UMLS are currently being adopted in text mining approaches because they provide domain knowledge for text mining approaches. In addition, biomedical ontologies enable us to resolve many linguistic problems when text mining approaches handle...

Fuzzy C-Means Clustering for Biomedical Documents Using Ontology Based Indexing and Semantic Annotation

Search is the most obvious application of information retrieval. The variety of widely obtainable biomedical data is enormous and is expanding fast. Because of this expansion, existing techniques are not sufficient to extract the most interesting patterns from the collection as per the user requirement. Recent research concentrates more on semantic-based searching than on the traditional term bas...

Semantic Text Mining and its Application in Biomedical Domain

Illhoi Yoo and Xiaohua Hu, Ph.D. A huge amount of biomedical knowledge and novel discoveries have been produced and collected in text databases or digital libraries, such as MEDLINE, because the most natural form to store information is text. In order to cope with this pressing text information overload, text mining is employed. However, ...

Journal title:
  • International journal of bioinformatics research and applications

Volume 3, Issue 3

Publication date: 2007